A Comprehensive Roman (English)-to-Bangla Transliteration Scheme

Author

  • Naushad UzZaman
Abstract

A transliteration scheme from Roman (English) to Bangla can help increase the use of Bangla in essential and diverse computing areas such as word processing, Internet and mobile communication, and information query and retrieval. The Bangla script's irregular phonetic nature and its large repertoire of consonant clusters (juktakkhors) create a large gap between the pronunciation and the orthography of a given Bangla word. In this paper, we describe a comprehensive Roman (English)-to-Bangla transliteration scheme that is designed to handle the full complexity of the Bangla script. We apply a phonetic encoding scheme to produce intermediate code-strings that facilitate matching the pronunciations of input strings against the desired outputs. We also provide graceful degradation to a more conventional direct phonetic mapping in special circumstances. A prototype of our scheme shows significant success in test cases.
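
The two-stage idea in the abstract (match through an intermediate phonetic code, then fall back to a direct letter mapping) can be illustrated with a minimal sketch. This is not the paper's encoding: the respelling rules, the two-word lexicon, the character tables, and the names `phonetic_code`, `direct_map`, and `transliterate` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon: canonical Roman spelling -> Bangla word.
ROMAN_TO_BANGLA = {"ami": "আমি", "bangla": "বাংলা"}

# Hypothetical direct-mapping tables (a tiny subset of the script).
CONSONANTS = {"k": "ক", "kh": "খ", "g": "গ", "m": "ম", "n": "ন",
              "b": "ব", "l": "ল", "r": "র", "t": "ত"}
VOWEL_SIGNS = {"a": "া", "i": "ি", "u": "ু", "e": "ে", "o": "ো"}
INDEPENDENT_VOWELS = {"a": "আ", "i": "ই", "u": "উ", "e": "এ", "o": "ও"}


def phonetic_code(roman: str) -> str:
    """Reduce a Roman spelling to a crude pronunciation key (assumed rules)."""
    s = roman.lower()
    for src, dst in (("ee", "i"), ("oo", "u"), ("aa", "a"), ("ph", "f"), ("z", "j")):
        s = s.replace(src, dst)
    # Collapse any remaining run of a repeated letter to a single letter.
    return re.sub(r"(.)\1+", r"\1", s)


# Index the lexicon by code so that variant spellings meet at the same key.
CODE_INDEX = {phonetic_code(r): b for r, b in ROMAN_TO_BANGLA.items()}


def direct_map(roman: str) -> str:
    """Fallback: greedy letter-by-letter mapping, longest chunk first."""
    s, out, i = roman.lower(), [], 0
    while i < len(s):
        for length in (2, 1):
            chunk = s[i:i + length]
            if chunk in CONSONANTS:
                out.append(CONSONANTS[chunk])
                i += len(chunk)
                break
            if chunk in VOWEL_SIGNS:
                # Use the dependent sign only directly after a consonant.
                after_consonant = bool(out) and out[-1] in CONSONANTS.values()
                out.append(VOWEL_SIGNS[chunk] if after_consonant
                           else INDEPENDENT_VOWELS[chunk])
                i += len(chunk)
                break
        else:
            out.append(s[i])  # Unknown character: pass it through unchanged.
            i += 1
    return "".join(out)


def transliterate(roman: str) -> str:
    """Try the phonetic-code lookup first, then degrade to direct mapping."""
    return CODE_INDEX.get(phonetic_code(roman)) or direct_map(roman)
```

In this sketch, variant spellings such as `amee` and `aami` reduce to the same code as `ami` and so resolve to the same lexicon entry, while a word absent from the lexicon, such as `kola`, is handled by the direct mapping. A real scheme would also need to cover consonant clusters and the inherent vowel, which this toy ignores.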

Similar resources

Phonetic Bengali Input Method for Computer and Mobile Devices

Current mobile devices do not support Bangla (or Bengali) Input method. Due to this many Bangla language speakers have to write Bangla in mobile phone using English alphabets. During this time they used to write English foreign words using English spelling. This tendency also exists when writing in computer using phonetically input methods, which cause many typing mistakes. In this scenario, co...

How to Translate Unknown Words for English to Bangla Machine Translation Using Transliteration

Due to small available English-Bangla parallel corpus, Example-Based Machine Translation (EBMT) system has high probability of handling unknown words. To improve translation quality for Bangla language, we propose a novel approach for EBMT using WordNet and International-Phonetic-Alphabet(IPA)-based transliteration. Proposed system first tries to find semantically related English words from Wor...

English to Bangla Phrase-Based Machine Translation

Machine Translation (MT) is the task of automatically translating a text from one language to another. In this work we describe a phrase-based Statistical Machine Translation (SMT) system that translates English sentences to Bangla. A transliteration module is added to handle out-of-vocabulary (OOV) words. This is especially useful for low-density languages like Bangla for which only a limited a...

Romanized Language Identification and Transliteration System for Security with an Authentication System Using Persuasive Cued Click Points - RLITS

Romanized script is popular today for communication in every country, as the script is almost universally enabled in text processors. In countries like India which is a linguistic cauldron, it is very common to see English text in email messages and chat transcripts, with generous sprinkling of words from local languages in roman script. Dubbed as Manglish (Malayalam and English) etc., this rom...

Resource Creation for Training and Testing of Transliteration Systems for Indian Languages

Machine transliteration is used in a number of NLP applications ranging from machine translation and information retrieval to input mechanisms for non-roman scripts. Many popular Input Method Editors for Indian languages, like Baraha, Akshara, Quillpad etc, use back-transliteration as a mechanism to allow users to input text in a number of Indian language. The lack of a standard dataset to eval...

Publication date: 2006